
mm/zsmalloc: per-cpu deferred free to accelerate swap entry release #810

Open
blktests-ci[bot] wants to merge 4 commits into linus-master_base from series/1091432=>linus-master

Conversation

blktests-ci Bot commented May 8, 2026

Pull request for series with
subject: mm/zsmalloc: per-cpu deferred free to accelerate swap entry release
version: 3
url: https://patchwork.kernel.org/project/linux-block/list/?series=1091432


blktests-ci Bot commented May 8, 2026

Upstream branch: 6d35786
series: https://patchwork.kernel.org/project/linux-block/list/?series=1091432
version: 3

wenchao-hao and others added 3 commits May 10, 2026 16:17
Add a per-cpu deferred free mechanism to zsmalloc with a callback
interface that lets callers (zram, zswap) customize push and drain
behavior.

Each CPU owns a single-page buffer. The hot path (zs_free_deferred)
writes a value into the current CPU's buffer via the push callback
with preemption disabled — no locks, no atomics. When the buffer
fills, it is swapped with a fresh page from a pre-allocated page
pool and the full page is queued to a WQ_UNBOUND worker for drain.

The drain worker invokes the drain callback which performs the actual
expensive work (zs_free, slot_free, etc.) in batch, away from the
original hot path.

Page pool management:
  - Pool is pre-allocated at enable time (ZS_DEFERRED_POOL_SIZE pages)
  - Full buffers are drained and returned to the pool
  - If no free page is available when buffer is full, the push falls
    back to synchronous processing by the caller

Signed-off-by: Wenchao Hao <haowenchao@xiaomi.com>
Register zswap_deferred_ops to defer the entire zswap_entry_free()
to the WQ_UNBOUND worker. The invalidate hot path only stores the
entry pointer into the per-cpu buffer (512 entries/page).

The drain callback performs the full entry teardown: lru_del, zs_free,
memcg uncharge, cache_free, and stats update. On deferred failure,
fall back to synchronous zswap_entry_free().

Signed-off-by: Wenchao Hao <haowenchao@xiaomi.com>
Register zram_deferred_ops with zs_pool_enable_deferred_free() to
defer slot freeing to a WQ_UNBOUND worker. The notify hot path only
stores a u32 slot index into the per-cpu buffer (1024 entries/page).

The drain callback does slot_lock + slot_free + slot_unlock for each
index. On deferred failure (no free page), fall back to synchronous
slot_lock + slot_free + slot_unlock.

Signed-off-by: Barry Song <baohua@kernel.org>
Signed-off-by: Wenchao Hao <haowenchao@xiaomi.com>

blktests-ci Bot commented May 10, 2026

Upstream branch: aa54b1d
series: https://patchwork.kernel.org/project/linux-block/list/?series=1091432
version: 3

Replace four separate flag clear operations in slot_free() with a
single mask write. This reduces redundant read-modify-write cycles
on the same flags word.

Signed-off-by: Wenchao Hao <haowenchao@xiaomi.com>
@blktests-ci blktests-ci Bot force-pushed the series/1091432=>linus-master branch from d9623ad to 66ef037 Compare May 10, 2026 16:17
